05. Assessing Data
The files all_alpha_08.csv and all_alpha_18.csv, discussed on the previous pages, are provided in the workspace. Use pandas to explore these datasets in the Jupyter Notebook, then answer the quiz questions about the following characteristics of the data:
- number of samples in each dataset
- number of columns in each dataset
- duplicate rows in each dataset
- datatypes of columns
- features with missing values
- number of non-null unique values for features in each dataset
- what those unique values are, and the count of each
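The first few checks in the list above can be sketched with a handful of pandas calls. This is a minimal illustration, not the course's reference solution: the helper name `summarize` and the toy DataFrame (with made-up columns `Model` and `Cyl`) are assumptions introduced here so the example runs without the CSV files.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect basic size and quality figures for one dataset (hypothetical helper)."""
    return {
        "samples": df.shape[0],                         # number of rows
        "columns": df.shape[1],                         # number of columns
        "duplicate_rows": int(df.duplicated().sum()),   # rows identical to an earlier row
        "rows_with_missing": int(df.isnull().any(axis=1).sum()),
        "features_with_missing": [c for c in df.columns if df[c].isnull().any()],
    }


# Toy data standing in for the real files. In the workspace you would load
# a dataset with pd.read_csv("all_alpha_08.csv") and pass that in instead.
toy = pd.DataFrame({
    "Model": ["A", "A", "B", "C"],
    "Cyl": [4.0, 4.0, None, 6.0],
})
summary = summarize(toy)
print(summary)
```

Calling the same helper on each dataset gives figures that can be compared side by side.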
QUIZ QUESTION:
Find the correct count for each of the following in the 2008 dataset
ANSWER CHOICES:
| Feature | Count |
|---|---|
| | 3889 |
| | 2404 |
| | 4 |
| | 199 |
| | 26 |
| | 25 |
| | 1611 |
| | 18 |
| | 1 |
SOLUTION:
| Feature | Count |
|---|---|
| | 2404 |
| | 199 |
| | 25 |
| | 18 |
QUIZ QUESTION:
Find the correct count for each of the following in the 2018 dataset
ANSWER CHOICES:
| Feature | Count |
|---|---|
| | 15 |
| | 1611 |
| | 18 |
| | 2404 |
| | 2 |
| | 32 |
| | 0 |
SOLUTION:
| Feature | Count |
|---|---|
| | 1611 |
| | 18 |
| | 2 |
| | 0 |
QUIZ QUESTION:
Match the datatype for each feature (some of these may not be ideal)
ANSWER CHOICES:
| Feature | Datatype |
|---|---|
| | bool |
| | float |
| | float |
| | string |
| | int |
| | string |
| | bool |
| | string |
| | int |
SOLUTION:
| Feature | Datatype |
|---|---|
| | float |
| | float |
| | string |
| | string |
| | string |
| | int |
| | string |
| | string |
| | string |
| | string |
| | string |
| | string |
| | int |
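When checking datatypes for the question above, note that pandas usually reports string columns with the dtype `object` (newer releases may show a dedicated string dtype instead), so it can help to look at the Python type of an individual value as well. The sketch below uses a toy DataFrame with invented column names and values, not the real datasets.

```python
import pandas as pd

# Toy frame with made-up columns; the real data would come from pd.read_csv(...).
toy = pd.DataFrame({
    "Model": ["A", "B"],
    "Displ": [3.5, 2.0],
    "Cyl": ["(6 cyl)", "(4 cyl)"],   # numbers stored as text: a non-ideal datatype
    "Score": [6, 7],
})

# One dtype per column.
print(toy.dtypes)

# Inspect the Python type of a single cell to confirm a text column holds str values.
first_cyl = toy["Cyl"].iloc[0]
print(type(first_cyl))
```

A column that looks numeric but holds `str` values is the kind of datatype the question describes as possibly not ideal.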
QUIZ QUESTION:
Match the number of non-null unique values for each of the following features
ANSWER CHOICES:
| Feature | Unique Values |
|---|---|
| | 3 |
| | 2 |
| | 5 |
| | 2 |
| | 42 |
| | 1 |
| | 18 |
| | 14 |
| | 3 |
SOLUTION:
| Feature | Unique Values |
|---|---|
| | 3 |
| | 3 |
| | 2 |
| | 2 |
| | 2 |
| | 2 |
| | 14 |
| | 3 |
| | 3 |
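Counts of non-null unique values, and the values themselves, can be obtained as sketched below. The column names and contents are invented for illustration; `nunique()` skips missing values by default, which matches the "non-null" wording of the question.

```python
import pandas as pd

# Toy frame with made-up columns and values.
toy = pd.DataFrame({
    "Drive": ["2WD", "4WD", "2WD", None],
    "SmartWay": ["yes", "no", "no", "no"],
})

# Number of distinct non-null values in every column.
unique_counts = toy.nunique()
print(unique_counts)

# Each distinct value of one column with how often it occurs (NaN excluded by default).
drive_counts = toy["Drive"].value_counts()
print(drive_counts)
```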
SOLUTION:
- Datatypes
- Format
- Number of unique values
QUIZ QUESTION:
Where is each of these fuel types present?
ANSWER CHOICES:
| Fuel Type | Dataset |
|---|---|
| | Both |
| | 2018 |
| | Both |
| | 2008 |
| | 2008 |
| | Neither |
| | Both |
| | 2018 |
SOLUTION:
| Fuel Type | Dataset |
|---|---|
| | Both |
| | Both |
| | 2018 |
| | 2018 |
| | Both |
| | Both |
| | 2008 |
| | 2008 |
| | Both |
| | Both |
| | 2018 |
| | 2018 |
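One way to work out which fuel types appear in which dataset is to compare the sets of unique values from each file. The sketch below assumes the fuel column is named `Fuel`, which is not stated on this page, and uses placeholder labels (`A` to `D`) rather than real fuel types.

```python
import pandas as pd

# Placeholder data; the column name "Fuel" and the labels are assumptions.
df_a = pd.DataFrame({"Fuel": ["A", "B", "C", None]})
df_b = pd.DataFrame({"Fuel": ["A", "B", "D"]})

# Unique non-null values of the column in each dataset.
fuels_a = set(df_a["Fuel"].dropna().unique())
fuels_b = set(df_b["Fuel"].dropna().unique())

both = fuels_a & fuels_b      # present in both datasets
only_a = fuels_a - fuels_b    # present only in the first
only_b = fuels_b - fuels_a    # present only in the second
print(sorted(both), sorted(only_a), sorted(only_b))
```

A value that appears in none of the three sets would correspond to the "Neither" choice.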